SCIENTIFIC 

REPORTS 




SUBJECT AREAS: 

STATISTICAL PHYSICS, 
THERMODYNAMICS AND 
NONLINEAR DYNAMICS 

PHYSICS 

TECTONICS 

INFORMATION THEORY AND 
COMPUTATION 



Received 
10 August 2012 

Accepted 
26 October 2012 

Published 
14 November 2012 



Correspondence and 
requests for materials 
should be addressed to 
E.L. (eugenio. 
lippiello@unina2.it) 



Spatial organization of foreshocks as a 
tool to forecast large earthquakes 

E. Lippiello 1 , W. Marzocchi 2 , L. de Arcangelis 3 & C. Godano 1 

1 Department of Mathematics & Physics and CNISM, Second University of Naples, 8 1 1 00 Caserta, Italy, 2 lstituto Nazionale 
Geofisica Vulcanologia, 00143 Roma, Italy, department of Industrial and Information Engineering and CNISM Second University 
of Naples, 81031 Aversa (CE), Italy. 

An increase in the number of smaller magnitude events, retrospectively named foreshocks, is often observed 
before large earthquakes. We show that the linear density probability of earthquakes occurring before and 
after small or intermediate mainshocks displays a symmetrical behavior, indicating that the size of the area 
fractured during the mainshock is encoded in the foreshock spatial organization. This observation can be 
used to discriminate spatial clustering due to foreshocks from the one induced by aftershocks and is 
implemented in an alarm-based model to forecast m > 6 earthquakes. A retrospective study of the last 19 
years Southern California catalog shows that the daily occurrence probability presents isolated peaks closely 
located in time and space to the epicenters of five of the six m > 6 earthquakes. We find daily probabilities as 
high as 25% (in cells of size 0.04 X 0.04deg 2 ), with significant probability gains with respect to standard 
models. 

The existence of different initiation processes leading to large and small earthquakes is still a central question 
in the debate on seismic predictability 1 . The organization in space, time and magnitude of seismicity before 
mainshocks probably represents the most suitable tool to enlighten possible differences. The central ques- 
tion is the existence of some features that discriminate events before large shocks from other earthquakes. 
Foreshocks are usually retrospectively identified on different temporal scales, ranging from hours up to months 
before mainshocks, and, in general, their number increases as the mainshock time is approaching, consistently 
with a power law 2 " 4 . Some studies have suggested that the magnitude distribution of foreshocks is different from 
the distribution of other earthquakes 5 " 7 and have shown correlations between the size of the foreshock zone and 
the magnitude of the subsequent earthquake 8 " 11 . These observations are consistent with a scenario where fore- 
shocks are the manifestation of an initiation process leading to the mainshock 1213 . However, the above features 
have been attributed to biases introduced by the foreshock selection criterion 14 " 17 . In fact, single mode triggering 
models, where mainshocks, aftershocks and foreshocks are treated on the same footing, are able to reproduce the 
above features of foreshock time-magnitude organization. In these models magnitudes are assumed to be inde- 
pendent of past seismicity and therefore no information on mainshock magnitude can be obtained from fore- 
shock properties. Within this approach seismicity before large earthquakes presents no distinct features. 

In this study we show that the spatial organization of foreshocks can be used to forecast large mainshock 
occurrence. We first consider small mainshocks following the original approach proposed by Felzer & Brodsky 
(FB) 18 to unveil the physical mechanisms behind aftershock occurrence. The great advantage of the FB approach 
is that a very large number of mainshock- aftershock couples can be analyzed, allowing for an accurate statistical 
study. 

Results 

Foreshocks before mainshocks with magnitude me [2,5]. According to the FB procedure (see Methods), an 
event is identified as a mainshock if it is sufficiently far in time and space from larger earthquakes. Aftershocks 
(foreshocks) are events occurring just after (before) the mainshock and close in space. We consider a linear 
density probability p(Ar) defined as the number of aftershocks (foreshocks) with epicenters at a distance in the 
interval [Ar, 1.2Ar] from the mainshock, divided by 0.2 Ar and by their total number, i.e. the linear density 
evaluated in ref.s 18, 19 divided by the total number of identified aftershocks (foreshocks). This normalization 
allows us to compare directly the functional form of the foreshock and aftershock distributions, even if their 
number are usually very different. In the left panel of Fig. 1 we compare p(Ar) for foreshocks and aftershocks in 
the Southern California Catalog 20 , with mainshock magnitude me[M,M-\- 1] and M = 2, 3, 4. Parameters of the 
FB procedure are listed in Table 1. The linear density probabilities before and after mainshocks are very similar in 
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Figure 1 | Aftershocks and foreshocks spatio-temporal organization in Southern California. Left upper panel. The linear density probability p(Ar) for 
foreshocks (filled circles) and aftershocks (empty diamonds) are obtained considering all events occurring within 12 hours from the mainshock. Different 
colors correspond to mainshocks in different magnitude classes, m e [M, M + 1), and M = 2, 3, 4 for black, red and green symbols respectively. We 
restrict the distribution to Ar < 3 km in order to reduce the contribution from background seismicity. Right upper panel The linear density probability 
p{ Ar) averaged over 50 independent realizations of synthetic catalogs generated by the ETAS model. Details on the numerical procedure are given in the 
Methods. Data for aftershocks and foreshocks in numerical catalogs are indicated as continuous lines and pluses, respectively. Open diamonds refer to the 
aftershock p{ Ar) in the experimental catalog. Black, red and green colors correspond to M = 2, 3, 4 respectively. Lower Panel. The inverse average distance 
R~ l (t) = J*^™* dAr(Ar) ~ l p(Ar,t) is plotted as function of time from the mainshock. Here p(Ar, t) is the linear density probability in the interval [— 1.2t, 
-t] ( [t y 1 .2 1] ) before (after) mainshocks for foreshocks (filled circles) and aftershocks (empty diamonds), respectively. We average 1/Ar, instead of Ar, in 
order to reduce the influence of background seismicity and, for the same reason, we fix R max = 3 km. The same symbols as in upper panels are used. 



the whole spatial range and for all values of M. Fig. 1 indicates that, 
not only for aftershocks but also for foreshocks, p(Ar) depends on the 
mainshock magnitude. We have explicitly verified that the 
distributions for different M tend to collapse on the same master 
curve if Ar is rescaled by 10' ?M , with r\ = 0.39 ±0.05, in agreement 
with the scaling form p{Ar) = L(M)F(Ar/L(M)) and L(M) = 0.05 
X 10' 7M km. This result, well established for the aftershock 
distribution 21 ' 22 indicates that the typical size of the foreshock zone 
also scales like 10 >?m with the mainshock magnitude m, in agree- 
ment with previous estimates for larger mainshocks 9 . In a recent 
publication the symmetry in the spatial organization of seismicity 
before and after M = 2 mainshocks has been attributed to arti- 
facts of the mainshock selection criterion 19 . In the supplementary 
information we have deeply checked the stability of the results 
with respect to different parameters of the FB procedure. Further- 
more, we have also verified that other mainshock selection 
criteria 21 ' 23 provide similar results. Here, we show that the FB 
mainshock procedure leads to expected features for the aftershock 
occurrence: First, we argue that a biased mainshock selection cannot 
produce a coherent pattern like the dependence of p(Ar) on the 
mainshock magnitude. Second, we analyze the inverse average 

distance from the mainshock = \^ max dAr(Ar) _1 p(Ar,t)J , 

with R max = 3 km (lower panel of Fig. 1). This quantity shows 
that, for all M, the mainshock occurrence time represents a 
singular point for £ -1 (f), i.e., £ -1 (r) grows before the mainshock 
occurrence and decreases after. This indicates that mainshocks are 
not part of a sequence triggered by large events that occurred before 
the time window used in the FB procedure. Finally, we find that both 



the number of aftershocks and foreshocks grow exponentially with 
the mainshock magnitude, consistently with a productivity law (see 
Supplementary Fig. 6). We note that the existence of reasonable pro- 
ductivity coefficients indicates that mainshocks are correctly selected. 

Once established the efficiency of the FB method in the identifica- 
tion of correlated couples, the crucial question is if events identified 
as foreshocks provide information on the subsequent mainshock 
magnitude. We emphasize that results reported in Fig. 1 (left panel) 
cannot be reproduced by single mode triggering models, where 
earthquakes are either independent (mainshocks) or triggered events 
(aftershocks). In these models, earthquakes identified as foreshocks 
are by construction mainshocks followed by a larger earthquake. 
Their asymptotic spatial decay is similar to the aftershock one, as 
observed in ref. 24, but p(Ar) only weakly depends on the mainshock 
magnitude class M. In order to explicitly verify this point we have 
performed extended numerical simulations of the ETAS model 25 ' 26 . 
More precisely we implement experimental parameters obtained 
from the likelihood maximization 27,28 in a numerical code (see 
Methods). We therefore apply the FB procedure to identify fore- 
shocks and aftershocks. Fig. 1 (right panel) shows that, as expected, 
the aftershock linear density probability depends on M whereas 
p(Ar) for foreshocks is weakly M-independent. As a consequence, 
p(Ar) for aftershocks and foreshocks are different, with discrepancies 
more pronounced for increasing M. 

On the other hand, some features of the organisation in time, space 
and magnitude of experimental foreshocks are recovered in ETAS 
catalogs. These are consequences of the fact that, in numerical cat- 
alogs, most of the events identified as mainshocks, and preceded by 
close-in-time earthquakes, are aftershocks triggered by smaller 
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Table 1 | Parametersof the FB criterion and number of mainshocks, 
aftershocks and foreshocks selected by the FB criterion for each 
mainshock class. Aftershocks (foreshocks) are events occurring 
within 3 km and 1 2 hours after (before) the mainshock 



FB parameters 


Value 


Description 




L 


100 km 


spatial range 




Xi 


3 days 


interval before 




X2 


1 2 hours 


interval after 






Mainshocks 


Aftershock 


Foreshock 


Mainshock class 


Number 


number 


number 


M 


remain 


n tot 


"tot 


2 


8919 


1057 


849 


3 


2332 


1875 


815 


4 


267 


1300 


307 



earthquakes. Therefore, stacking several couples of such events leads 
to apparent clustering features. For instance, numerical foreshocks 
exhibit a productivity law with the mainshock magnitude as well as 
an increase both in the number of events (inverse Omori law) and in 
the inverse average distance R~\t) as the time approaches the main- 
shock. However, the growth of R~ 1 (t) is not symmetrical for after- 
shocks and foreshocks differently than the observed behavior in the 
experimental catalog (Fig. 1 lower panel). We wish to stress that 
differences are also observed in the number of identified foreshocks. 
A recent study 14 , indeed, has evidenced a deficit of foreshocks in 
synthetic ETAS catalogs with respect to the number of foreshocks 
in real seismic catalogs. This discrepancy has been attributed either to 
lack of aftershocks due to catalog incompleteness, or to a potential 
indication of the existence of a preparatory process leading to main- 
shock occurrence. In the Suppl.Fig.s 6,7 we present results of numer- 
ical simulations of the ETAS model confirming the foreshock deficit. 
The number of foreshocks in the synthetic catalog also increases 
exponentially with the mainshock magnitude. However, the coef- 
ficient of this growth is always significantly smaller than the experi- 
mental value and does not depend on the catalog incompleteness (see 
Suppl.Fig. 6). 

Foreshocks before mainshocks with m > 6. The above observa- 
tions suggest that seismic spatial and temporal clustering represents a 
potential tool to forecast mainshock occurrence. In the following we 
develop an alarm-based model for m > 6 mainshocks that imple- 
ments results of Fig. 1 obtained for smaller mainshocks. We divide 
the Southern California region in cells of area dL = 0.04° X 0.04° and 
indicate with Xk the coordinates of the k-th cell center. We evaluate 
the daily expected number of earthquakes per cell A(xk,t), in the k-th 
cell position at the time t by means of the ETAS model (see Methods). 
Then, assuming that A(xkJ)<^l, the daily probability to have at least 
one event in the k-th cell is given by P^ TAS (t) ~A(xk,t). For each m 
> 6 mainshock, the quantity P^ TAS (t) is evaluated at the time t of the 
last m > 2.5 earthquake preceding the mainshock. All seismic maps 
(see Suppl. Fig. 8) present sharp maxima (in the range [2E — 3, 2E — 
2]) that are closely located (up to 5 kms) to the future mainshock 
epicenter, except for the Northridge earthquake. In this case a smaller 
peak (of amplitude 1.2E — 6) is located 25 km away from the future 
epicenter. The presence of sharp maxima in P k (t) can be attributed to 
the occurrence of small earthquakes near the mainshock epicenter 
from few days up to few minutes before the mainshocks. 

Within the ETAS scenario, these small events trigger other earth- 
quakes whose magnitude is randomly chosen from a Gutenberg- 
Richter law; therefore a sequence that anticipates a large event does 
not have any distinctive feature. Results of Fig. 1 show a different 
scenario. In particular, the lower panel of Fig. 1 provides a possible 
mechanism to discriminate between spatial clustering due to 



foreshocks and aftershocks. In the case of foreshocks, indeed, we 
expect that for each sequence the inverse average distance i?" 1 is an 
increasing function of time, whereas i?" 1 decreases soon after main- 
shock occurrence. This indicates that seismicity tends to reduce the 
spatial variability approaching the mainshock in a characteristic way, 
concentrating in an area surrounding the future epicenter, and then 
to spread out after the mainshock occurrence. This scenario is in 
agreement with recent observations of earthquake migration towards 
the rupture initiation point of the mainshock, during the month 
preceding the 2011 Tohoku-Oki Earthquake 29 . In the following we 
implement foreshock spatial clustering in a forecasting model. At 
each event occurrence time t, we define in position x the quantity 
R~ 1 (x,t) which represents the inverse distance averaged over the last 
n events, with m > 2.5, before time t. Moreover, we indicate with t h < 
t the occurrence time of the (n + l)-th earthquake before f, and 
evaluate R~ 1 (x,ty) as the inverse average distance for n events before 
t b . We then introduce the quantity (j) n (x,t)=R~ 1 (x,t)/R~ 1 (x,ty) 
which is expected to be <j> n > 1 before mainshock occurrence and 
0 n < 1 soon after. This result is confirmed in Suppl.Fig. 9 for all m > 
6 mainshocks. We then define a foreshock based (FS) alarm function 
A(xk,t) = CA n (xk,t)(j) n (xk,t) in the k-th cell at time t, where C is a 
constant and A n is the occurrence probability given by the ETAS 
model restricted to the last n earthquakes with m > 2.5. This defini- 
tion implies that our forecasting takes into account only n events 
closer in time to the future mainshock. When A(3cfc,f)<0, the daily 
probability to have at least one event in the k-th cell is finally given by 
Pl S (t) ~A(xk,t). The constant C is, then, fixed by the condition the 
total number of expected events with m > 6 in the FS and in the ETAS 
model coincide, i.e., \dt J2 k Ak(xk,t) = $ dt^2 h A(xk,t), where the 
sum extends to all cells and the integral covers the entire catalog 
duration. This new procedure introduces only one additional para- 
meter n with respect to the ETAS model. We have verified that results 
weakly depend on n for ne [10,50] , in the following we show results for 
n = 20. In Table 2 we list the daily occurrence probability in the cell 
containing the mainshock epicenter for the six m > 6 earthquakes, for 
both the ETAS and the FS model. Fig. 2 shows that before each 
mainshock a sharp maximum of P£ s is present at a small distance 
(few kilometers) from the future mainshock epicenter. Only in the 
case of the Northridge earthquake, the maximum location is 30 kms 
from the epicenter. For all mainshocks, in the cells containing the 
future epicenter, P™ is larger than the value obtained by the ETAS 
model. The most striking result is for the Landers earthquake, where a 
daily occurrence probability 24.75% is observed in the cell containing 
the future epicenter. Cells with even higher probabilities (39.2%) are 
observed before the Superstition Hill earthquake, where the main- 
shock epicenter is located at a distance of roughly 4 km from the 
maximum of P^. The small value of P£ s (t) before the Northridge 
earthquake can be attributed to the lack of close in time earthquakes 
leading to a very small A n . Interestingly, the function <j> n (x,t) shows 
evidence of spatial clustering even in this case (see Suppl.Fig. 9). The 
joint occurrence probability of all six m > 6 events, T1 FS is simply given 
by the product of the six P£ s (f) at the time t just before the six 



Table 2 | Information about forecasting maps. For each m > 6 
earthquake the table lists the value of the probability evaluated 
by the FS model at the mainshock epicenter, and the same quant- 
ity evaluated by the ETAS model 



Mainshock Mainshock Probability FS ETAS Probability 



SuperstitionHill 0.007 5E-4 

JoshuaTree 0.04 2E-3 

Landers 0.25 7E-3 

BigBear 0.0025 8E-4 

Northridge 1 .2E-6 1E-7 

HectorMine 0.075 1 E-3 
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Figure 2 | Daily occurrence probability just before m > 6 earthquakes. The probability P F k s to have a m > 6 earthquake within l day in a cell of side 0.04°, 
is evaluated for the 6 largest events in Southern California just after the occurrence of the last event before mainshock. For each mainshock, we plot P^ s 
over the entire Southern California region (upper panels), and a zoom over a box centered in the future mainshock epicenter (front panels). Black stars 
indicate the mainshock epicenter location and the values of P FS can be obtained from the color code bar. 
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Figure 3 | Space-time Molchan diagrams. The Molchan trajectories of the 
FS model relative to the RI reference model (black filled circles) and to the 
ETAS reference model (blue open diamonds). Points on the descending 
diagonal (red curve) indicate equivalent model performance. For points 
below the dot-dashed green line, the null hypothesis (equivalent 
probability for the two models) can be rejected with a confidence level 
larger than 99,99%. 

mainshocks in the cells including the epicenter. The same quantity is 
evaluated for the ETAS model. We obtain a ratio U FS /U ETAS = 2.8E7 
indicating a significant gain for the FS model with respect to the ETAS 
model. 

In Fig. 3 we apply a standard procedure outlined in ref.30, 31 to 
compare the FS forecasting model with the ETAS and the relative 
intensity (RI) model 32 . In the latter model, the occurrence probability 
Pf is time independent and implements spatial clustering on the 
basis of smoothed historical seismicity. In the following is 
obtained using the parameters suggested by Rundle et al. and used 
in 31 to compare the RI model with a pattern informatics (PI) 
model 33 and with the United States Geological Survey National 
Seismic Hazard Map (NSHM) 34 . Results have shown that neither 
PI nor NSHM provide significant performance gain relative to the 
RI reference model. The comparison is performed by means of stand- 
ard Molchan diagrams, a plot of miss rate H versus the fraction of 
space-time occupied by alarms t. The value (0, 0) in the plot repre- 
sents perfect prediction with zero missed events and perfectly loca- 
lized alarms. The descending diagonal from (0, 1) to (1, 0), 
conversely, represents the line with performance gain G = (1 — 
H)/t = 1, i.e., the same performance for the two models. We use this 
procedure to compare the different models. More precisely, we 
evaluate Pl s (t) in all cells and, for different thresholds P th , we declare 
an alarm at the time t in those cells with P£ s (f) >Pth- In this way we 
obtain 6 different values of miss rates. For each value of P^, the 
fraction of space-time occupied by alarms is given by the integral 
in time and space of Pf(P^ TAS (/:)) over all cells with Pf (f) >P th . All 
points below the descending diagonal indicate that the FS model 
performs better than the other model. In particular, perfect predic- 
tion by the FS model corresponds to points lying on the vertical axis. 
Fig. 3 indicates that the FS model performsjnuch better than the 
other models. We introduce the average gain G = (l — H) /t, where H 
and t are the average values of H and t for the six points on the 
Molchan diagram. The FS model exhibits an average gain with 
respect to the RI model G = 50.7, which becomes G = 38320 if the 
Northridge earthquake is not included in the average. The compar- 
ison with the ETAS model gives G = 4.5. This result is weakly affected 
by the implementation of different spatial kernels in the ETAS model. 



Discussion 

It is interesting to notice that the large improvement is obtained only 
on the basis of the last few events before mainshock and it does not 
contain any information on the fault structure and historical seismi- 
city. As final remark, we emphasize that all these results have been 
obtained retrospectively. An unbiased evaluation of the forecasting 
performances will be obtained through prospective tests. For this 
reason the next step will be to submit this model in the prospective 
experiments conducted by the Collaboratory for the Studies of 
Earthquake Predictability 35 . 

Methods 

Mainshock, aftershock and foreshock selection criterion. According to the FB 
method an event is identified as a mainshock if a larger earthquake does not occur in 
the previous y days and within a distance L. In addition a larger earthquake must not 
occur in the selected area in the following^ days. Typical values used by FB are L = 
100 km, y = 3 and y 2 = 0.5. Aftershocks and foreshocks are all events occurring, 
respectively, in the subsequent or in the preceding time interval 3t = 12 h and within 
a circle of radius R = 3 km from the mainshock epicenter. Other parameter values 
provide similar results (see Suppl. Fig. 1). 

ETAS model parameters and numerical simulations. In the ETAS model, the daily 
occurrence seismic rate in the position ~x at time t is evaluated on the basis of all the 
earthquakes with magnitude w f > m c , epicentral location ~Xi and occurrence time f f - 
< t and is given by 

A( *>,f) =fi(lt) + ^" ?)(m ' " mc) (l + tj Y) ~*g(b,m x - m c ) (1) 

where d { is the angular distance between ~x and ~Xi, and fi( ~x) is a time independent 
contribution related to the occurrence probability of Poisson events. Events with m < 
m c cannot trigger future earthquakes. 

For g(di,rm — m c )= — — ^ ^1 + — 1 ,^f'_ m j \ > the model parameters, ji y B, a, c, 

p, D, q and y, can be estimated by maximizing the likelihood function. In the specific 
case m c = 2.5 the parameters have been kindly provided by J. Zhuang who obtained 
them by means of an iterative algorithm 23 ' 27 ' 28 . Their values are B = 0.618, a = 1.198, c 
= 0.0024 days,p = 1.05, D = 9E - 7 deg 2 , q = 1.034. fi is finally obtained on the basis 
of the procedure described in ref.17. The quantity A n (lc,i) used in the evaluation of 
Pjp is given by Eq.(l) restricting the sum over to the n events occurring before t. The 
spatial kernel used for results in Fig. 1 is obtained according to the following pro- 
cedure: we fix for m = 2 g(Ar, m) = p(Ar), where p(Ar) is the linear density prob- 
ability obtained for M = 2 in the experimental catalog (see Fig. 1 left panel). For m > 
2, g(Ar, m) is obtained from g(Ar, m = 2) assuming the scaling relation p(Ar) = 
L{mY l F{ArlL{m)), that fits experimental data with L(m) = L(m = 2)10 04(m " 2) . 

Numerical simulations are performed according to the algorithm proposed by 
Zhuang et al. (Ref. 17). In order to simulate a catalog with m>m' c = 2, we assume that 
seismic properties do not depend on the lower magnitude threshold. This implies that 
the new set of parameters B', a', c',p', D', q', y', can be obtained from the m c = 2.5 
parameters, taking a' = a, d = c,p' = p, q' = qy' = y, B' = £e a ( mc ~ mc ) and 
D' = Be' , y nc ~ mc > . We have also implemented a different choice of model parameters 
and verified that results plotted in Fig. 1 (upper right panel) do not depend on the 
specific parameter set. 
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